Adaptive Geometric Multiscale Approximations for Intrinsically Low-dimensional Data

Authors

  • Wenjing Liao
  • Mauro Maggioni
Abstract

We consider the problem of efficiently approximating and encoding high-dimensional data sampled from a probability distribution ρ in R^D that is nearly supported on a d-dimensional set M, for example on a d-dimensional Riemannian manifold. Geometric MultiResolution Analysis (GMRA) provides a robust and computationally efficient procedure to construct low-dimensional geometric approximations of M at varying resolutions. We introduce a thresholding algorithm on the geometric wavelet coefficients, leading to what we call adaptive GMRA approximations. We show that these data-driven, empirical approximations perform well, when the threshold is chosen as a suitable universal function of the number of samples n, on a wide variety of measures ρ that are allowed to exhibit different regularity at different scales and locations, thereby efficiently encoding data from more complex measures than those supported on manifolds. These approximations yield a data-driven dictionary, together with a fast transform mapping data to coefficients, and an inverse of such a map. The algorithms for both the dictionary construction and the transforms have complexity Cn log n, with the constant C linear in D and exponential in d. Our work therefore establishes adaptive GMRA as a fast dictionary learning algorithm with approximation guarantees. We include several numerical experiments on both synthetic and real data, confirming our theoretical results and demonstrating the effectiveness of adaptive GMRA.
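The thresholding idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the median split along the first principal direction, the greedy top-down stopping rule, the refinement measure `delta`, and all function names are assumptions made for illustration, and the threshold `tau` is left as a free parameter because the abstract only states that it is a function of n. Each cell is approximated by a d-dimensional affine plane fitted by local PCA, and a cell is refined only when refinement changes the approximation by at least `tau`.

```python
import numpy as np


def fit_plane(cell, d):
    """Fit a d-dimensional affine plane to the rows of `cell` by PCA."""
    center = cell.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(cell - center, full_matrices=False)
    return center, vt[:d]


def project(cell, center, basis):
    """Orthogonally project the rows of `cell` onto center + span(basis)."""
    return center + (cell - center) @ basis.T @ basis


def adaptive_approximation(points, d, tau):
    """Piecewise-affine approximation, refined only where refinement matters.

    Simplified, hypothetical stand-in for wavelet-coefficient thresholding:
    a cell is split when the change between its own plane and its two
    children's planes (`delta`) is at least `tau`.
    """
    n = len(points)
    min_size = 2 * (d + 1)  # assumed minimum cell size for a stable PCA fit
    approx = np.empty_like(points)

    def refine(idx):
        cell = points[idx]
        center, basis = fit_plane(cell, d)
        coarse = project(cell, center, basis)
        if len(idx) < 2 * min_size:
            approx[idx] = coarse  # too few points to split further
            return
        # Split the cell at the median along its first principal direction.
        order = np.argsort((cell - center) @ basis[0])
        half = len(idx) // 2
        parts = (order[:half], order[half:])
        fine = np.empty_like(cell)
        for part in parts:
            c, b = fit_plane(cell[part], d)
            fine[part] = project(cell[part], c, b)
        # Size of the correction gained by refining, normalized by n.
        delta = np.sqrt(np.sum((fine - coarse) ** 2) / n)
        if delta < tau:
            approx[idx] = coarse  # refinement below threshold: keep coarse plane
            return
        for part in parts:
            refine(idx[part])

    refine(np.arange(n))
    return approx


def mse(points, approx):
    """Mean squared Euclidean distance between points and their approximations."""
    return float(np.mean(np.sum((points - approx) ** 2, axis=1)))


# Toy data: a unit circle (d = 1) embedded in R^3 (D = 3).
theta = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
circle = np.stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)], axis=1)

coarse_err = mse(circle, adaptive_approximation(circle, d=1, tau=np.inf))
fine_err = mse(circle, adaptive_approximation(circle, d=1, tau=0.0))
```

With `tau=np.inf` nothing is refined and the circle is approximated by a single line. With `tau=0.0` every cell is refined down to the minimum size. Intermediate values trade approximation error against the number of planes kept, which is the trade-off the thresholding in the abstract is meant to control.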

Similar resources

Multiscale Geometric Dictionaries for Point-cloud Data

We develop a novel geometric multiresolution analysis for analyzing intrinsically low-dimensional point clouds in high-dimensional spaces, modeled as samples from a d-dimensional set M (in particular, a manifold) embedded in R^D, in the regime d ≪ D. This type of situation has been recognized as important in various applications, such as the analysis of sounds, images, and gene arrays. In this pape...


Multi-Resolution Geometric Analysis for Data in High Dimensions

Large data sets arise in a wide variety of applications and are often modeled as samples from a probability distribution in high-dimensional space. It is sometimes assumed that the support of such a probability distribution is well approximated by a set of low intrinsic dimension, perhaps even a low-dimensional smooth manifold. Samples are often corrupted by high-dimensional noise. We are interest...


Some recent advances in multiscale geometric analysis of point clouds

We discuss recent work based on multiscale geometric analysis for the study of large data sets that lie in high-dimensional spaces but have low-dimensional structure. We present three applications: the first one to the estimation of intrinsic dimension of sampled manifolds, the second one to the construction of multiscale dictionaries, called geometric wavelets, for the analysis of point clouds...


Multiscale Strategies for Computing Optimal Transport

This paper presents a multiscale approach to efficiently compute approximate optimal transport plans between point sets. It is particularly well-suited for point sets that are in high dimensions but are close to being intrinsically low-dimensional. The approach is based on an adaptive multiscale decomposition of the point sets. The multiscale decomposition yields a sequence of optimal transpor...


High-Dimensional Menger-Type Curvatures - Part I: Geometric Multipoles and Multiscale Inequalities

We define discrete Menger-type curvature of d+2 points in a real separable Hilbert space H by an appropriate scaling of the squared volume of the corresponding (d+1)-simplex. We then form a continuous curvature of an Ahlfors regular measure μ on H by integrating the discrete curvature according to products of μ (or its restriction to balls). The essence of this work, which continues in a subseq...


Journal:
  • CoRR

Volume: abs/1611.01179

Publication date: 2016